Abstract: The achievable and converse regions for sparse
representation of white Gaussian noise based on an overcomplete 
dictionary are derived in the limit of large systems. Furthermore, 
the marginal distribution of such sparse representations is also 
inferred. The results are obtained via the Replica method which 
stems from statistical mechanics. A direct outcome of these results
is a sharp threshold for $\ell_0$-norm decoding
in noisy compressed sensing, together with its mean-square error for
underdetermined Gaussian vector channels.

I. Introduction 

White Gaussian noise (WGN) is a canonical component 
in innumerable systems, models and applications related to 
information and signal theory. In this contribution, we study 
sparse representations of WGN. Sparse representations are 
based on dictionary matrices, whose columns are termed atoms.
Often the atoms are chosen from a so-called overcomplete dictionary,
for which the number of atoms exceeds the dimension of the signal
space. Such a dictionary forms a full-rank, fat matrix. A sparse
representation of a signal is a linear combination of only
a few atoms of the dictionary. In this paper we concentrate
on zero mean, unit variance, identically and independently 
distributed (i.i.d.) dictionaries (like the Gaussian or Bernoulli 
dictionaries) in the limit of infinite dimensions. 

Although non-compressible, WGN vector realizations
do have sparse representations. For example, consider a trivial
sparse representation (based on a Gaussian dictionary) with
Hamming weight equal to the dimension of the WGN vector.
Such a sparse representation is clearly achievable, since the
resulting square dictionary matrix is almost surely invertible
in the large-system limit. This square matrix is formed
by concatenating the columns of the original fat
dictionary matrix that correspond to the non-zero entries of
the sparse representation. Evidently, denser representations are
also attainable, up to the fully dense representation. On the
other hand, a sparse representation consisting
of only a single atom does not necessarily exist: the event
that the WGN realization is a scaled copy of one
of the atoms of the Gaussian dictionary instance has
probability zero.
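The trivial representation described above can be sketched numerically. The following is a minimal illustration, assuming small hypothetical sizes `m` and `n`, an arbitrary random seed, and the $1/\sqrt{n}$ normalization that is adopted later in Section II; none of these values come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 8, 16                      # illustrative sizes (measurement ratio m/n = 0.5)
D = rng.standard_normal((m, n))   # Gaussian dictionary realization
w = rng.standard_normal(m)        # WGN realization

# Pick any m atoms; the square sub-dictionary is invertible almost surely.
support = np.sort(rng.choice(n, size=m, replace=False))
z = np.zeros(n)
z[support] = np.linalg.solve(D[:, support] / np.sqrt(n), w)

hamming_weight = np.count_nonzero(z)
residual = np.linalg.norm(D @ z / np.sqrt(n) - w)
print(hamming_weight, residual)
```

Any choice of `m` distinct atoms works here, which is why this representation is called trivial: its Hamming weight equals the dimension of the noise vector.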



This observation raises some interesting, yet so far unresolved,
questions:

1) Can one go below the trivial sparsity when representing
WGN?

2) If so, what is the minimal Hamming weight delimiting
the achievable and converse regions of WGN sparse
representations?

3) What do such representations look like in terms
of their probability density function?

In this paper, we answer these fundamental questions by
utilizing the Replica method, born from the study of disordered
systems in statistical physics. The Replica method [1], despite
being mathematically non-rigorous [2], has proven itself as a
powerful tool in solving open problems in information theory
and communications (see, e.g., [3]), and particularly in the
analysis of sparse signals [4]-[8]. One immediate consequence
of our analysis of WGN sparse representation is in revealing
a sharp threshold for $\ell_0$-norm decoding in noisy compressed
sensing [9]-[11] and its mean-square error performance.

The paper is organized as follows. The problem formulation
is first introduced in Section II. Section III investigates the
characteristics of sparse representation of WGN in the large-system
limit, while in Section IV the latter is applied to the
analysis of $\ell_0$-norm decoding in noisy compressed sensing.
We conclude in Section V.

II. Problem Formulation 

Let $\mathbf{w} \in \mathbb{R}^m$ be a WGN vector of dimension $m \in \mathbb{N}^*$
with i.i.d. entries $w_i \sim \mathcal{N}(0,1)$, $i = 1, \ldots, m$.¹ Consider an
overcomplete dictionary $\mathbf{D} \in \mathbb{R}^{m \times n}$ with $n = m/\alpha \in \mathbb{N}^*$
atoms and zero mean, unit variance, i.i.d. entries $D_{ij}$, $j = 1, \ldots, n$.
Examples of such dictionaries are the Gaussian,
$D_{ij} \sim \mathcal{N}(0,1)$, and Bernoulli, $D_{ij} = \pm 1$, dictionaries. The
scalar $\alpha \in (0,1)$ is termed the measurement ratio. The vector
$\mathbf{w}$ and matrix $\mathbf{D}$ are statistically independent. The realizations
of the WGN, $\mathbf{w}$, and dictionary, $\mathbf{D}$, are denoted by $\boldsymbol{\omega}$ and $\mathcal{D}$,
respectively.

¹ The symbols $\{\cdot\}_i$ and $\{\cdot\}_{ij}$ denote entries of a vector and matrix,
respectively.



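Let $\mathbf{z}(\boldsymbol{\omega}, \mathcal{D}) \in \mathbb{R}^n$ be a representation of the WGN instance,
$\boldsymbol{\omega}$, via a certain overcomplete dictionary, $\mathcal{D}$, namely

$$\boldsymbol{\omega} = \frac{1}{\sqrt{n}} \mathcal{D}\, \mathbf{z}(\boldsymbol{\omega}, \mathcal{D}). \qquad (1)$$

The representation vector $\mathbf{z}(\boldsymbol{\omega}, \mathcal{D}) = \mathbf{z}_\kappa(\boldsymbol{\omega}, \mathcal{D})$, explaining
an observed $\boldsymbol{\omega}$ given the dictionary $\mathcal{D}$, is termed a $\kappa$-sparse
representation if at most $k = \kappa n \in \mathbb{N}^*$ of its entries are
non-zero, i.e. $\|\mathbf{z}(\boldsymbol{\omega}, \mathcal{D})\|_0 / n \le \kappa$, where $\kappa \in (0,1)$ is
the sparsity fraction. The $\ell_0$ norm of a vector is defined as
$\|\mathbf{z}\|_0 \triangleq \#\{i \in \{1, \ldots, n\} \mid z_i \neq 0\}$.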
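As a small illustration of these definitions, the sketch below counts non-zero entries and checks whether a candidate vector is a $\kappa$-sparse representation in the sense of (1). The helper names `l0_norm` and `is_kappa_sparse_representation`, the tolerance, and the toy Bernoulli dictionary are assumptions made for this example only.

```python
import numpy as np

def l0_norm(z, tol=0.0):
    """Number of entries of z whose magnitude exceeds tol."""
    return int(np.sum(np.abs(z) > tol))

def is_kappa_sparse_representation(z, omega, D, kappa, atol=1e-9):
    """Check the representation equation (1) and ||z||_0 / n <= kappa."""
    n = D.shape[1]
    explains = bool(np.allclose(D @ z / np.sqrt(n), omega, atol=atol))
    return explains and l0_norm(z) <= kappa * n

# Toy Bernoulli (+/-1) dictionary with m = 2 measurements and n = 4 atoms.
D = np.array([[1.0, -1.0, 1.0, 1.0],
              [1.0, 1.0, -1.0, 1.0]])
z = np.array([2.0, 0.0, 0.0, -4.0])     # two non-zero entries
omega = D @ z / np.sqrt(4)              # built from z, so (1) holds by construction

print(l0_norm(z), is_kappa_sparse_representation(z, omega, D, kappa=0.5))
```

With $n = 4$ and two non-zero entries, the vector is $\kappa$-sparse for $\kappa = 0.5$ but not for any smaller fraction.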

Hereinafter, a large-system limit is assumed:
the WGN dimension and the number of dictionary atoms go
to infinity, i.e. $m, n \to \infty$, but with fixed sparsity
fraction, $\kappa$, and measurement ratio, $\alpha$.

The central question under investigation in this paper is as
follows: what is the normalized Hamming weight, or sparsity
fraction $\kappa^*$, of the sparsest representation of WGN based on
an $\alpha$-measurement dictionary in the limit of large systems?
Mathematically speaking, we are looking for $\kappa^*$ such that

$$\mathbf{z}_{\kappa^*}(\boldsymbol{\omega}, \mathcal{D}) = \arg\min_{\mathbf{z}} \|\mathbf{z}\|_0 \quad \text{subject to} \quad \boldsymbol{\omega} = \frac{1}{\sqrt{n}} \mathcal{D} \mathbf{z}. \qquad (2)$$

The minimal normalized Hamming weight, $\kappa^*$, as expressed
in (2), is a function of the specific realizations $\boldsymbol{\omega}$ and $\mathcal{D}$.
Owing to the self-averaging property [1], one can alternatively
define $\kappa^*$ in terms of averaging over all realizations, making
it amenable to evaluation. The self-averaging property, in the
context of our analysis, is described in the following assumption.
Note that herein the symbol $\mathbb{E}_{\cdot}\{\cdot\}$ denotes expectation of
the random object within the brackets with respect to (w.r.t.)
the subscript random variables.

Assumption 1. The limit $\kappa^* = \lim_{n \to \infty} \kappa^*(\boldsymbol{\omega}, \mathcal{D})$ exists
and is equal to its average over the randomness of the
WGN and dictionary, $\lim_{n \to \infty} \mathbb{E}_{\mathbf{w},\mathbf{D}}\{\kappa^*(\mathbf{w}, \mathbf{D})\}$, for almost
all realizations of the WGN and dictionary.

Based on Assumption 1, the next section is dedicated to the
explicit computation of the minimal sparsity fraction, $\kappa^*$. This
minimal Hamming weight is the key to a better understanding
of the principal characteristics of WGN sparse representations,
as described in the rest of this contribution.

III. Sparse Representation of WGN 

A. Achievable and Converse Regions 

The fundamental questions 1) and 2) raised in the
Introduction are addressed in the following claim.

Claim 2. Consider the scalars $\alpha \in (0,1)$ and
$\kappa^*_\alpha = 2Q(\xi) \in (0,1)$, where $\xi > 0$ is determined by

$$\alpha = \sqrt{\frac{2}{\pi}} \int_\xi^\infty t^2 \exp(-t^2/2)\, dt, \qquad (3)$$

and $Q(\xi) = \int_\xi^\infty \frac{dt}{\sqrt{2\pi}} \exp(-t^2/2)$ is the Q-function.
Then, with probability 1 in the large-system limit, for a zero
mean, unit variance, i.i.d. dictionary with measurement ratio
$\alpha$:

i) (minimal Hamming weight) the sparsest WGN representation
is $\kappa^*_\alpha$-sparse;

ii) (achievable region) a $\kappa$-sparse representation of WGN exists
only for $\kappa \ge \kappa^*_\alpha$;

iii) (converse region) a $\kappa$-sparse representation of WGN does
not exist for $\kappa < \kappa^*_\alpha$.

Figure 1. Illustration of Claim 2. Achievable (shaded) and non-achievable
(unshaded) regions for $\kappa$-sparse representation of WGN via a dictionary with
measurement ratio $\alpha$ in the large-system limit. The horizontal axis is the
sparsity fraction $\kappa$. The solid curve denotes the sharp
threshold, $\kappa^*_\alpha$, delimiting the two regions. The dashed line denotes the
trivial sparsity threshold, while the circles mark the threshold obtained from
simulations.
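The threshold of Claim 2 can be evaluated numerically. The sketch below assumes the reading of (3) with prefactor $\sqrt{2/\pi}$ and integration range $[\xi, \infty)$, which reproduces $\kappa^* = 1$ at $\alpha = 1$; integrating (3) by parts then gives $\alpha = \sqrt{2/\pi}\,\xi e^{-\xi^2/2} + 2Q(\xi)$. The function names and the bisection bracket are choices made for this example.

```python
import math

def alpha_of_xi(xi):
    """Right-hand side of (3), after integration by parts:
    sqrt(2/pi) * xi * exp(-xi^2/2) + 2*Q(xi)."""
    two_q = math.erfc(xi / math.sqrt(2.0))           # equals 2*Q(xi)
    return math.sqrt(2.0 / math.pi) * xi * math.exp(-xi * xi / 2.0) + two_q

def kappa_star(alpha, tol=1e-12):
    """Solve (3) for xi by bisection and return kappa* = 2*Q(xi)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    lo, hi = 0.0, 40.0          # alpha_of_xi decreases from 1 toward 0 on this bracket
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if alpha_of_xi(mid) > alpha:
            lo = mid
        else:
            hi = mid
    xi = 0.5 * (lo + hi)
    return math.erfc(xi / math.sqrt(2.0))

for a in (0.2, 0.5, 0.8):
    print(a, round(kappa_star(a), 4))
```

Because the first term of `alpha_of_xi` is positive for $\xi > 0$, the computed $\kappa^*_\alpha$ always falls below the trivial sparsity $\alpha$.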

Figure 1 illustrates the outcome of Claim 2. Interestingly,
for a given measurement ratio $\alpha$, we can go below the trivial
$\alpha$-sparse representation down to the sparsest representation,
which is $\kappa^*_\alpha$-sparse. Note also that for the well-posed case
of $\alpha = 1$, one gets $\kappa^* = 1$ as expected, that is, only a dense
representation exists. The proof of Claim 2 is as follows.
Proof: 

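Define an energy or cost function

$$\mathcal{E}_\kappa(\tilde{\mathbf{z}}, \boldsymbol{\omega}, \mathcal{D}, m, n) = \frac{1}{m} \sum_{i=1}^m \Big( \frac{1}{\sqrt{n}} \sum_{j=1}^n \mathcal{D}_{ij} \tilde{z}_j - \omega_i \Big)^2. \qquad (4)$$

Note in passing that the vector $\tilde{\mathbf{z}} \in \mathbb{R}^n$ is used in the definition
of the cost function (4), rather than $\mathbf{z}$, since the latter denotes
a (sparse) representation (1), which may not necessarily exist
for certain values of $\kappa$ and instances $\boldsymbol{\omega}$ and $\mathcal{D}$.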
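To make the role of the cost function concrete, the following sketch evaluates a mean-squared-residual energy for a candidate vector and minimizes it over a fixed support by least squares. It assumes the cost (4) is the per-measurement squared residual with the $1/\sqrt{n}$ normalization of (1); the function names, sizes and seed are hypothetical.

```python
import numpy as np

def energy(z_tilde, omega, D):
    """Mean squared residual of the candidate z_tilde, in the spirit of (4)."""
    n = D.shape[1]
    return float(np.mean((D @ z_tilde / np.sqrt(n) - omega) ** 2))

def min_energy_on_support(support, omega, D):
    """Least-squares fit restricted to the atoms listed in `support`."""
    n = D.shape[1]
    z = np.zeros(n)
    coeffs, *_ = np.linalg.lstsq(D[:, support] / np.sqrt(n), omega, rcond=None)
    z[support] = coeffs
    return energy(z, omega, D), z

rng = np.random.default_rng(1)
m, n = 10, 20
D = rng.standard_normal((m, n))
omega = rng.standard_normal(m)

for k in (2, 5, 10):                       # nested supports of growing size
    e, _ = min_energy_on_support(np.arange(k), omega, D)
    print(k, e)
```

A candidate is a valid representation exactly when its energy vanishes, which in this finite example happens once the support reaches the trivial size `m`.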

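The energy function $\mathcal{E}_\kappa$ can be rewritten equivalently as

$$\mathcal{E}_\kappa(\boldsymbol{\zeta}, \mathbf{b}, \boldsymbol{\omega}, \mathcal{D}, m, n) = \frac{1}{m} \sum_{i=1}^m \Big( \frac{1}{\sqrt{n}} \sum_{j=1}^n \mathcal{D}_{ij} b_j \zeta_j - \omega_i \Big)^2, \qquad (5)$$

where $\tilde{z}_j = b_j \zeta_j$, $\boldsymbol{\zeta} \in \mathbb{R}^n$, and the binary variable $b_j \in \{0,1\}$
has a Bernoulli parameter $\kappa = \big(\sum_{j=1}^n b_j\big)/n$. Based on
Assumption 1 on self-averaging in the large-system
limit, one can state that almost surely

$$\mathcal{E}_\kappa(\boldsymbol{\zeta}, \mathbf{b}, \boldsymbol{\omega}, \mathcal{D}, m, n) \to \lim_{n\to\infty} \mathbb{E}_{\mathbf{w},\mathbf{D}}\{\mathcal{E}_\kappa(\boldsymbol{\zeta}, \mathbf{b}, \mathbf{w}, \mathbf{D}, m, n)\} = \mathcal{E}_\kappa(\boldsymbol{\zeta}, \mathbf{b}, \alpha). \qquad (6)$$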
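The minimal energy can be obtained from the limit of zero
temperature, $1/\beta \to 0$,

$$\mathcal{E}_\kappa^{\min}(\alpha) = \lim_{\beta\to\infty} \frac{1}{\alpha} \frac{\partial \big(\beta \mathcal{F}(\beta, \alpha)\big)}{\partial \beta}, \qquad (7)$$

where $\mathcal{F}$ is the normalized average free energy

$$-\beta \mathcal{F}(\beta, \alpha) \triangleq \lim_{n\to\infty} \frac{1}{n} \mathbb{E}_{\mathbf{w},\mathbf{D}}\{\log \mathcal{Z}(\mathbf{w}, \mathbf{D}, \beta, m, n)\}, \qquad (8)$$

and the partition function is defined as

$$\mathcal{Z}(\boldsymbol{\omega}, \mathcal{D}, \beta, m, n) \triangleq \prod_{j=1}^n \sum_{b_j \in \{0,1\}} \int_{-\infty}^{\infty} \prod_{j=1}^n d\zeta_j \; \delta\Big(\sum_{j=1}^n b_j - \kappa n\Big) \exp\big(-\beta m\, \mathcal{E}_\kappa(\boldsymbol{\zeta}, \mathbf{b}, \boldsymbol{\omega}, \mathcal{D}, m, n)\big). \qquad (9)$$

The function $\delta(\cdot)$ denotes the Kronecker delta.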

In the limit of zero temperature, $\beta \to \infty$ (7), only the vector
$\tilde{\mathbf{z}}$ which minimizes the energy $\mathcal{E}_\kappa$ will eventually contribute to
the partition function $\mathcal{Z}$ and consequently to the free energy
$\mathcal{F}$. Other solutions vanish exponentially in the summation (9).
Based on the definitions of the sparse representation (1) and
energy function (4), observe that a zero minimal energy,
$\mathcal{E}_\kappa^{\min} = 0$, implies the existence of $\kappa$-sparse representations,
while evidently for $\mathcal{E}_\kappa^{\min} > 0$ there is no such representation.

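The quenched average in (8) is too complicated to be
computed directly, thus it is carried out via the Replica
method [1]. This method relies on the mathematical identity

$$\mathbb{E}_{\mathbf{w},\mathbf{D}}\{\log \mathcal{Z}\} = \lim_{r \to 0} \frac{1}{r} \log \mathbb{E}_{\mathbf{w},\mathbf{D}}\{\mathcal{Z}^r\}. \qquad (10)$$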
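The identity (10) can be checked numerically on a toy random variable. The sketch below is only an illustration of the limit, not of the replica calculation itself: it assumes a hypothetical log-normal $\mathcal{Z}$ (so that $\log \mathcal{Z}$ is Gaussian with made-up parameters `mu` and `sigma`) and replaces expectations by sample means.

```python
import math
import random

random.seed(0)
mu, sigma = 0.3, 0.8                        # hypothetical parameters of log Z
log_z = [random.gauss(mu, sigma) for _ in range(200_000)]

def replica_estimate(r):
    """(1/r) * log E{Z^r}, with the expectation replaced by a sample mean."""
    mean_zr = sum(math.exp(r * x) for x in log_z) / len(log_z)
    return math.log(mean_zr) / r

quenched = sum(log_z) / len(log_z)          # sample estimate of E{log Z}
for r in (1.0, 0.1, 0.01):
    print(r, replica_estimate(r), quenched)
```

For a log-normal variable the left-hand quantity equals $\mu + r\sigma^2/2$, so the gap to the quenched average shrinks linearly as $r \to 0$.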

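According to the replica trick, one first evaluates $\mathbb{E}_{\mathbf{w},\mathbf{D}}\{\mathcal{Z}^r\}$
for integer $r$ and then continues analytically to $r = 0$.
Applying a standard replica analysis, and particularly a replica-symmetric
(RS) ansatz² à la Gardner and Derrida [12], [13],
the minimal energy takes the form

$$\mathcal{E}_\kappa^{\min}(\alpha) = \lim_{\beta\to\infty} \frac{1 + Q + \beta (Q - q)^2}{2 \big(1 + \beta (Q - q)\big)^2}, \qquad (11)$$

where the squared $\ell_2$-norm of $\tilde{\mathbf{z}}$ is

$$Q \triangleq \frac{1}{n} \sum_{j=1}^n \big(\tilde{z}_j^a\big)^2, \qquad (12)$$

and

$$q \triangleq \frac{1}{n} \sum_{j=1}^n \tilde{z}_j^a \tilde{z}_j^b \qquad (13)$$

is the replica-symmetric (i.e., independent of replica indices
$a, b \in \mathbb{N}^*$) physical order parameter. The latter parameter
measures the overlap between two different solutions $\tilde{\mathbf{z}}^a$ and $\tilde{\mathbf{z}}^b$.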
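The $Q$ and $q$ (and other saddle-point) parameters are obtained
by the extremization of $\mathcal{F}_{RS}$, the replica-symmetric average
free-energy density. For a finite scalar

$$\chi \triangleq \beta (Q - q), \qquad (14)$$

the minimization of $\mathcal{F}_{RS}$ w.r.t. $Q$ yields the squared norm of
the optimal solution

$$Q = \frac{\alpha^*_\kappa}{\alpha - \alpha^*_\kappa}, \qquad (15)$$

where the threshold

$$\alpha^*_\kappa = \sqrt{\frac{2}{\pi}} \int_\xi^\infty t^2 \exp(-t^2/2)\, dt, \qquad (16)$$

and $\xi$ is the unique solution of

$$\sqrt{\frac{2}{\pi}} \int_\xi^\infty dt \exp(-t^2/2) = \kappa. \qquad (17)$$

Hence in the region

$$\alpha > \alpha^*_\kappa \qquad (18)$$

the minimal energy boils down to

$$\mathcal{E}_\kappa^{\min}(\alpha) = \frac{1}{2(1+Q)} = \frac{\alpha - \alpha^*_\kappa}{2\alpha} > 0, \qquad (19)$$

implying that there is no sparse representation in this region.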



Also, in this region of finite auxiliary parameter $\chi$ (14), since
$\beta \to \infty$ we get $Q = q$, meaning that there is only a single
solution $\tilde{\mathbf{z}}$ for which the energy attains its minimal value $\mathcal{E}_\kappa^{\min}$.
Instead of expressing this region in terms of an inequality on the
measurement ratio (18), one can describe it using the sparsity
fraction of the solution $\tilde{\mathbf{z}}$ by requiring

$$\kappa < \kappa^*_\alpha, \qquad (20)$$

where the sparsity threshold

$$\kappa^*_\alpha = \sqrt{\frac{2}{\pi}} \int_\xi^\infty dt \exp(-t^2/2), \qquad (21)$$

and this time the integral limit $\xi$ is determined by solving

$$\alpha = \sqrt{\frac{2}{\pi}} \int_\xi^\infty t^2 \exp(-t^2/2)\, dt, \qquad (22)$$

analogously to (16) and (17). This yields the converse region
claim iii).

² The RS ansatz is locally stable and conjectured to be globally stable also.

On the other hand, for the complementary case of $\kappa > \kappa^*_\alpha$,
one finds $\chi \to \infty$ (14), thus from the minimal energy expression
(11) we learn that in this region $\mathcal{E}_\kappa^{\min} = 0$, establishing the
achievability statement ii). In this region, since $Q \neq q$, there is
an infinite number of zero-energy solutions per $(\kappa, \alpha)$-point.
Hence, the threshold $\kappa = \kappa^*_\alpha$ describes, with probability 1,
the minimal possible Hamming weight of the sparsest WGN
representation, as claimed in i). This concludes the proof. ■

B. Achieving Distribution 

We now turn to question 3) from the Introduction and examine
the achieving probability density function of the WGN
$\kappa$-sparse representation via an $\alpha$-measurement dictionary. The
results are summarized in the next claim.

Claim 3. The marginal probability density function of the
$j$'th (non-zero) entry, $\zeta_j$, of the (minimal $\ell_2$-norm) $\kappa$-sparse
representation of WGN, $\mathbf{z}_\kappa$, is given in the large-system limit
by

$$p(\zeta_j) = \begin{cases} 0, & \text{if } |\zeta_j| < \cdots \\ \cdots \exp(\cdots), & \text{otherwise} \end{cases} \qquad (23)$$

where $\alpha^*_\kappa$ (16) is the achievability threshold for given sparsity
fraction $\kappa$.

Figure 2. Illustration of Claim 3. The probability density function
(pdf) of the non-zero entries, $\zeta_j$, of the WGN ($\kappa = 0.1$)-sparse
representation is shown for several measurement ratios
$\alpha = 0.1$ (magenta), $0.2$ (blue), $0.3$ (red), $0.4$ (green). Only the positive
$\zeta$-axis of the symmetric pdf is drawn for compactness. The inset depicts the
corresponding points in the achievable region (the full circle colors match
their counterparts in the pdf plots). The maximal measurement ratio possible
in this case, $\alpha^*_{0.1} \approx 0.44$, is denoted by a full square.
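The achievability threshold for a given sparsity fraction can be evaluated directly. The sketch below assumes the reading of (16) and (17) with prefactor $\sqrt{2/\pi}$ and integration range $[\xi, \infty)$, so that (17) becomes $2Q(\xi) = \kappa$ and integration by parts turns (16) into $\kappa + \sqrt{2/\pi}\,\xi e^{-\xi^2/2}$. The function name `alpha_threshold` is introduced here for illustration.

```python
import math
from statistics import NormalDist

def alpha_threshold(kappa):
    """Achievability threshold of (16)-(17) for a sparsity fraction kappa."""
    if not 0.0 < kappa < 1.0:
        raise ValueError("kappa must lie in (0, 1)")
    # (17): 2*Q(xi) = kappa  <=>  xi = Phi^{-1}(1 - kappa/2)
    xi = NormalDist().inv_cdf(1.0 - kappa / 2.0)
    # (16) after integration by parts: kappa + sqrt(2/pi) * xi * exp(-xi^2/2)
    return kappa + math.sqrt(2.0 / math.pi) * xi * math.exp(-xi * xi / 2.0)

print(round(alpha_threshold(0.1), 2))
```

For $\kappa = 0.1$ this evaluates to roughly 0.44, in line with the maximal measurement ratio quoted for the ($\kappa = 0.1$) example of Figure 2.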

Proof: 

The proof of Claim 3 is also based on replica analysis and
is deferred to another publication. ■

Figure 2 gives some examples of densities for different
$(\kappa, \alpha)$-points. A few remarks on Claim 3 are in order. Note that
the derived probability density function is only the marginal
one, $p(\zeta_j)$. The joint distribution of $\boldsymbol{\zeta}$ is an interesting,
yet challenging, open question. In the large-system limit the
achieving probability density is not a function of the WGN
vector and dictionary realizations. There is an infinite number
of sparse representations per $(\kappa, \alpha)$-point in the achievable
region. Furthermore, the stated achieving distribution corresponds
to the representation with the minimal $\ell_2$ norm.
Observe that, remarkably, as $\alpha \to \alpha^*_\kappa$, the measure zero gap
in the probability densities increases to infinity, and the non-vanishing
part of the distribution becomes more uniform.

According to Claim 3, a given measurement ratio $\alpha$ determines
an optimal "compression" rate $\kappa^*$. For this rate the
sparsest representation consists of $\kappa^* n$ entries of infinite value.
The locations of these infinite spikes, distributed uniformly
over the $n$ entry indices, are dictated by the specific WGN
instance. The rest of the $(1 - \kappa^*) n$ entries are set to zero.
Thus the number of non-zero entries describing the noise
instance is reduced w.r.t. its original length, i.e. $k < m$.
However, since their locations are noise-realization dependent,
an $n > m$-length representation is still required. Hence this is
only a so-called compressed sparse representation rather than
a real compression, which, as is well known, does not exist
for WGN.

Figure 3. Experimental study for $\alpha = 0.5$. The (normalized) Hamming
weight of the sparsest representation of WGN is averaged (empty circles)
from computer simulations and plotted versus $1/n$. Quadratic fitting is used to extrapolate
the Hamming weight to the limit of large systems, $\kappa^*_\infty$ (full circle). The
theoretical $\kappa^*_{0.5}$ in this case, derived from Claim 2, is also marked (full square)
for comparison.
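The quadratic extrapolation in $1/n$ used for Figure 3 can be sketched as follows. The data below are synthetic placeholders generated from a known quadratic, not the paper's measurements; they serve only to check that the fit recovers the intercept at $1/n = 0$. The function name and coefficients are hypothetical.

```python
import numpy as np

def extrapolate_to_infinite_n(n_values, kappa_values):
    """Fit kappa ~ c0 + c1/n + c2/n^2 and return the intercept c0."""
    x = 1.0 / np.asarray(n_values, dtype=float)
    c2, c1, c0 = np.polyfit(x, np.asarray(kappa_values, dtype=float), deg=2)
    return float(c0)

# Synthetic placeholder data (NOT simulation results from the paper):
# an exact quadratic in 1/n with a known intercept.
n_values = np.array([40, 60, 80, 100, 150, 200])
true_c0, true_c1, true_c2 = 0.30, 2.0, -15.0
kappa_values = true_c0 + true_c1 / n_values + true_c2 / n_values**2

print(extrapolate_to_infinite_n(n_values, kappa_values))
```

With real simulation averages in place of the synthetic values, the returned intercept plays the role of the extrapolated large-system sparsity fraction.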

C. Experimental Results 

The theoretical minimal Hamming weight (Claim 2) is
corroborated by computer simulations. First, a WGN vector, $\boldsymbol{\omega}$,
and a Gaussian dictionary, $\mathcal{D}$, are generated. Then the
sparsest representation $\mathbf{z}_{\kappa^*}$ per realization is inferred using
the iteratively reweighted least-squares (IRLS) method [14].
The IRLS method is a tractable approximation of the $\ell_0$-norm
minimization (2). The sparsity fraction of the simulated sparsest
representation is then averaged over a sufficiently large
ensemble of realizations. This procedure is repeated
for several numbers of atoms, $n$, varying from 40 up to 200.
Figure 3 displays the averaged simulated minimal sparsity
fraction versus the reciprocal of the number of dictionary
atoms, $1/n$, for an example case of $\alpha = 0.5$. A quadratic
fit is then applied so as to extrapolate the minimal Hamming
weight to infinite $n$ (i.e., the crossing point with the vertical
axis at $1/n = 0$). Figure 1 presents the extrapolated minimal
sparsity fraction, $\kappa^*_\infty$, over a range of measurement ratios $\alpha$.
Note that a fairly good agreement is obtained between
the simulation-based $\kappa^*_\infty$ and the theoretical curve $\kappa^*_\alpha$. The
discrepancy between theory and simulations may be explained
first by extrapolation errors and second by the fact that the IRLS
method is only an approximation to $\ell_0$-norm optimization, due
to its tendency to converge to a local, rather than
global, minimum.
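A generic IRLS iteration for the constrained problem (2) might look like the sketch below. This is an $\epsilon$-regularized reweighting scheme chosen for illustration; it is not necessarily the variant of [14], and the parameters `p`, `n_iter`, `eps`, `eps_min`, the counting threshold, and the problem sizes are all assumptions.

```python
import numpy as np

def irls_sparse(A, y, p=0.0, n_iter=60, eps=1.0, eps_min=1e-8):
    """Sparse solution of A z = y by iteratively reweighted least squares.

    Each iterate solves  min sum_j wt_j * z_j^2  subject to  A z = y  in
    closed form, with weights wt_j = (z_j^2 + eps)^(p/2 - 1) taken from the
    previous iterate.
    """
    z = np.linalg.pinv(A) @ y                      # minimum l2-norm start
    for _ in range(n_iter):
        inv_w = (z**2 + eps) ** (1.0 - p / 2.0)    # 1 / wt_j
        AW = A * inv_w                             # A @ diag(inv_w)
        z = inv_w * (A.T @ np.linalg.solve(AW @ A.T, y))
        eps = max(eps / 10.0, eps_min)             # anneal the smoothing
    return z

rng = np.random.default_rng(2)
m, n = 20, 40                                      # measurement ratio 0.5
D = rng.standard_normal((m, n))
omega = rng.standard_normal(m)

z = irls_sparse(D / np.sqrt(n), omega)
residual = np.linalg.norm(D @ z / np.sqrt(n) - omega)
weight = int(np.count_nonzero(np.abs(z) > 1e-6 * np.abs(z).max()))
print(residual, weight / n)
```

Every iterate satisfies the constraint by construction, so only the sparsity pattern changes across iterations; as the text notes, such schemes can settle in a local minimum, so the reported weight is an upper estimate of the true minimum.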

Another empirical study is devoted to affirming the derived
marginal achieving probability (Claim 3). To this end, we
first generate a Gaussian dictionary, $\mathcal{D}$, and a sparse representation,
$\mathbf{z}$. The latter is generated according to the
stochastic profile provided by Claim 3 (i.e., the corresponding
probability densities as depicted, for instance, by the blue
Gaussian tail in Figure 2). The non-zero entry locations in
the sparse representation are chosen uniformly at random.
Then, the vector $\mathbf{w} = \mathcal{D}\mathbf{z}/\sqrt{n}$ is calculated. Repeating
this process for a large ensemble of realizations, we build a
histogram for the vector $\mathbf{w}$. Figure 4 displays the values of a
standard Gaussian distribution, $\mathcal{N}(0,1)$, on the horizontal axis
and the values of the obtained histogram on the vertical axis.
If the histogram were a perfect Gaussian,
the x-markers would fall exactly on the reference solid line.
One may observe that the experimentally generated noise, $\mathbf{w}$,
is approximately Gaussian, as the x-marks fall in the vicinity
of the straight line. Note that this mismatch is to be
expected, mainly due to the inaccurate implicit assumption,
made in this simulation study, that the non-zero entries of
the sparse representation are i.i.d. and drawn based only on
the marginal probabilities, rather than the joint one (which
is unfortunately unknown). Furthermore, this empirical test
does not say anything about whether the generated (approximately)
Gaussian noise vector, $\mathbf{w}$, is white or not. In the next section,
we explore some consequences of what we have learned so far
about sparse representations of WGN for the analysis of $\ell_0$-norm
optimization in noisy compressed sensing (CS, [9]-[11]).

Figure 4. Experimental study of achieving distribution for the case of $\alpha = 0.2$
and $\kappa = 0.1$. The histogram values of the generated noise are plotted
(x-marks) as a function of the values of a standard Gaussian distribution. The
solid line displays the case where the histogram would be exactly Gaussian.

Figure 5. Mapping a $\kappa_x$-sparse noisy channel to an equivalent
$(\kappa_x + \kappa^* - \kappa_x \kappa^*)$-sparse noiseless channel. Panel (b): noiseless channel.

IV. Noisy Compressed Sensing 

Consider a $\kappa_x \in (0,1)$-sparse data vector, $\mathbf{x}_\kappa \in \mathbb{R}^n$, with
a finite second moment. Suppose a noiseless zero mean, unit
variance and i.i.d. (e.g., Gaussian) linear transformation, with
ratio $\alpha$, of the data, $\mathbf{y} = \mathcal{D}\mathbf{x}_\kappa$, is observed. We are interested
in perfectly reconstructing the data from the underdetermined
measurements. According to the CS literature (e.g., [15]), given
the $\ell_0$-norm decoder

$$\hat{\mathbf{x}} = \arg\min_{\mathbf{x}} \|\mathbf{x}\|_0 \quad \text{subject to} \quad \mathbf{y} = \mathcal{D}\mathbf{x}, \qquad (24)$$

a perfect recovery, $\hat{\mathbf{x}} = \mathbf{x}_\kappa$, can be achieved for almost
any $\mathbf{x}_\kappa$, with probability 1, if $\alpha > \kappa_x$. This is termed the
weak, or typical, noiseless $\ell_0$-norm decodable region.³ Exact
reconstruction is impossible with overwhelming probability for
$\alpha < \kappa_x$, which is the strong converse region.

³ There is also the strong, or worst case, decodable region $\alpha > 2\kappa_x$, which
leads to a correct recovery for any $\mathbf{x}_\kappa$.

Unfortunately, the $\ell_0$-norm decoder is prohibitively complex,
posing an NP-hard optimization problem, since
it requires combinatorial enumeration of the $\binom{n}{\kappa_x n}$ possible
sparse vectors. One of the exciting wonders of CS is that the
$\ell_0$ norm in the optimization problem (24) can be replaced by
an $\ell_1$ norm ($\|\mathbf{x}\|_1 = \sum_i |x_i|$). This replacement turns (24) into
a tractable optimization problem with polynomial complexity,
but miraculously it still generates perfect reconstruction,
$\hat{\mathbf{x}} = \mathbf{x}_\kappa$, with probability 1. However, the feasibility of the
$\ell_1$-norm decoder comes at the cost of more required linear
measurements (i.e., a larger $\alpha$ for a given sparsity $\kappa_x$). Nevertheless,
the $\ell_0$-norm decoder is of major theoretical importance, as
it bounds the performance of practical reconstruction methods.
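The combinatorial nature of the $\ell_0$-norm decoder (24) can be seen in a brute-force sketch for a tiny noiseless instance. The function name `l0_decode`, the residual tolerance, the sizes and the seed are assumptions for this example; the exhaustive search is feasible only because $n$ is very small.

```python
import itertools
import numpy as np

def l0_decode(D, y, tol=1e-9):
    """Exhaustive l0 decoder for y = D x: try supports of growing size."""
    m, n = D.shape
    if np.linalg.norm(y) <= tol:
        return np.zeros(n)
    for k in range(1, n + 1):
        for support in itertools.combinations(range(n), k):
            cols = list(support)
            coeffs, *_ = np.linalg.lstsq(D[:, cols], y, rcond=None)
            if np.linalg.norm(D[:, cols] @ coeffs - y) <= tol:
                x_hat = np.zeros(n)
                x_hat[cols] = coeffs
                return x_hat
    return None

rng = np.random.default_rng(3)
m, n, k = 6, 12, 2                         # ratio 0.5 exceeds sparsity 1/6
D = rng.standard_normal((m, n))
x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
y = D @ x

x_hat = l0_decode(D, y)
print(np.allclose(x_hat, x, atol=1e-8))
```

The number of candidate supports grows as a binomial coefficient in $n$, which is what makes this decoder impractical beyond toy sizes.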

Moving to the more realistic case of noisy compressed
sensing, consider an overloaded Gaussian vector channel

$$\mathbf{y} = \sqrt{\frac{\mathsf{snr}}{m}}\, \mathcal{D} \mathbf{x}_\kappa + \boldsymbol{\omega}, \qquad (25)$$

where $\mathsf{snr}$ is the signal-to-noise ratio (SNR) gain of the
channel. Examples of such Gaussian vector channels include, to
name a few, CDMA and MIMO communication systems. For a
CDMA (code-division multiple-access) channel the Bernoulli
dictionary is mapped onto the spreading matrix, while the
number of measurements, $m$, and atoms, $n$, correspond to
the processing gain and number of active users, respectively.
Similarly, for a MIMO (multiple-input multiple-output) system
in Gaussian fading the number of measurements, $m$, and
atoms, $n$, are translated to the number of receiving and
transmitting antennas, respectively.

Let $\mathbf{z}_{\kappa^*}$ be a $\kappa^*$-sparsest WGN representation. Then
the overloaded Gaussian vector channel can be reformulated
as

$$\mathbf{y} = \frac{1}{\sqrt{m}} \mathcal{D} \big( \sqrt{\mathsf{snr}}\, \mathbf{x}_\kappa + \sqrt{\alpha}\, \mathbf{z}_{\kappa^*} \big) = \frac{1}{\sqrt{m}} \mathcal{D} \mathbf{x}^*, \qquad (26)$$

where $\mathbf{x}^* = \sqrt{\mathsf{snr}}\, \mathbf{x}_\kappa + \sqrt{\alpha}\, \mathbf{z}_{\kappa^*}$ is the sparsest explanation
of the observations $\mathbf{y}$ given the dictionary/channel $\mathcal{D}$.⁴ Thus,
interestingly, based on Claim 2 the noisy channel with a $\kappa_x$-sparse
data input vector may be mapped into an equivalent
noiseless channel with a denser $(\kappa_x + \kappa^* - \kappa_x \kappa^*)$-sparse input
vector, as shown in Figure 5. The subtraction of $\kappa_x \kappa^*$ accounts
for the partial overlap between the $\kappa_x n$ finite non-zero entries
of the input vector and the $\kappa^* n$ infinite entries of the WGN sparsest
representation determined by nature. Despite the infinite-valued
entries of $\mathbf{z}_{\kappa^*}$, it is noteworthy that the sparsest vector
$\mathbf{x}^*$, which explains the observations $\mathbf{y}$ and is generated via
the $\ell_0$-norm optimization problem (24), may consist of only
finite non-zero entries. An extreme example of this
occurs when a sparse (but not the sparsest) representation
of the WGN realization is formed by $(\kappa_x + \kappa^* - \kappa_x \kappa^*) n$
non-zero entries which are finite (since this representation no
longer resides on the border of the achievable region) and fully
covers the indices of the $\kappa_x n$ finite non-zero entries of the data
input. Still, in either case the minimal sparsity of the total $\mathbf{x}^*$
remains unchanged. The above argument is summarized in the
following corollary, which leans on Claims 2 and 3.

Corollary 4. Given a Gaussian vector channel (25) with
measurement ratio $\alpha \in (0,1)$ and arbitrary $\mathsf{snr} > 0$, then
in the large-system limit an $\ell_0$-norm decoder (24) results,
with probability 1, in a $(\kappa_x + \kappa^* - \kappa_x \kappa^*)$-sparse vector with
support $\Omega_0$, where $\kappa_x$ is the channel input sparsity fraction
and $\kappa^*$ is the minimal normalized Hamming weight for the given
measurement ratio $\alpha$. The support, $\Omega$, of the data input vector,
$\mathbf{x}$, satisfies $\Omega \subset \Omega_0$.
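The combined sparsity fraction in Corollary 4 is the size of the union of two supports. A quick simulation illustrates the overlap term, assuming the two supports are drawn independently and uniformly at random; the vector length and the two sparsity fractions below are hypothetical values picked for the example.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200_000                               # illustrative vector length
kappa_x, kappa_star = 0.10, 0.25          # hypothetical sparsity fractions

# Two supports drawn independently and uniformly at random.
support_x = np.zeros(n, dtype=bool)
support_x[rng.choice(n, size=int(kappa_x * n), replace=False)] = True
support_z = np.zeros(n, dtype=bool)
support_z[rng.choice(n, size=int(kappa_star * n), replace=False)] = True

empirical = float(np.mean(support_x | support_z))
predicted = kappa_x + kappa_star - kappa_x * kappa_star
print(empirical, predicted)
```

The product term is simply the expected fraction of indices shared by the two supports, which is why it is subtracted once.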

The following claim states the noisy counterpart to the $\ell_0$-decodable
and converse regions described at the beginning
of this section for the noiseless case.

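Claim 5. Given a Gaussian vector channel (25) with measurement
ratio $\alpha \in (0,1)$ and arbitrary $\mathsf{snr} > 0$, then in the large-system
limit an $\ell_0$-norm decoder (24) results, with probability
1, in an average mean-square error (MSE) per unknown

$$\frac{1}{n} \|\mathbf{x} - \hat{\mathbf{x}}\|^2 = \frac{\cdots}{\mathsf{snr}} \qquad (27)$$

as long as

$$\kappa_x < \frac{\alpha - \kappa^*_\alpha}{1 - \kappa^*_\alpha} \qquad (28)$$

for almost any $\mathbf{x}$. Otherwise $\ell_0$-reconstruction is impossible
with overwhelming probability.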
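The noisy decodability threshold can be tabulated against the noiseless one. The sketch below assumes the threshold follows from requiring the combined sparsity $\kappa_x + \kappa^* - \kappa_x \kappa^*$ to stay below $\alpha$, and it recomputes $\kappa^*$ from (3) by bisection using the integration-by-parts form $\alpha = \sqrt{2/\pi}\,\xi e^{-\xi^2/2} + 2Q(\xi)$. Function names and the sample ratios are illustrative.

```python
import math

def kappa_star(alpha, tol=1e-12):
    """kappa* = 2*Q(xi), with xi solving (3) by bisection."""
    def alpha_of_xi(xi):
        return (math.sqrt(2.0 / math.pi) * xi * math.exp(-xi * xi / 2.0)
                + math.erfc(xi / math.sqrt(2.0)))
    lo, hi = 0.0, 40.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if alpha_of_xi(mid) > alpha else (lo, mid)
    return math.erfc(0.5 * (lo + hi) / math.sqrt(2.0))

def noisy_l0_threshold(alpha):
    """Largest input sparsity with kappa_x + k* - kappa_x * k* < alpha."""
    ks = kappa_star(alpha)
    return (alpha - ks) / (1.0 - ks)

for alpha in (0.3, 0.5, 0.7, 0.9):
    print(alpha, noisy_l0_threshold(alpha), "noiseless:", alpha)
```

The noisy threshold always sits below the noiseless bound $\kappa_x < \alpha$ and, as the claim states, does not depend on the SNR.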

Figure 6 draws the sharp threshold (28) and displays the $\ell_0$-decodable
region in the $(\kappa_x, \alpha)$ plane for the noisy case. Also
marked is the classical threshold for the noiseless case. As may
be expected, the presence of ambient noise polluting the
observations increases the threshold on the number of linear
measurements required for optimal recovery. However, note
that the noisy threshold itself is insensitive to the SNR level.
Note also that in the absence of any other information (e.g.,
a prior on $\mathbf{x}_\kappa$), reconstruction with an MSE which is inversely proportional
to the SNR, as stated in Claim 5, is the best one can hope for.
The derived MSE (27) is also comparable to the MSE of an
oracle decoder, which likewise scales as $1/\mathsf{snr}$ and magically knows the support $\Omega$ of
the original data input.

⁴ The $\sqrt{\alpha}$ scaling of $\mathbf{z}_{\kappa^*}$ in (26) is due to the fact that the Gaussian vector
channel is normalized by $\sqrt{m}$, so as to maintain unity power per atom, while
the definition (1) of the sparse representation uses normalization by $\sqrt{n}$.

Figure 6. Illustration of Claim 5. $\ell_0$-decodable (shaded) and non-decodable
(unshaded) regions for an underdetermined Gaussian vector channel with $\kappa_x$-sparse
input and measurement ratio $\alpha$ in the large-system limit. The solid curve
denotes the sharp noisy threshold delimiting the two regions. The dashed
line denotes the classical noiseless threshold.

Figure 7. Illustration of achievable and decodable regions. The plot marks the
non-achievable region, the achievable and decodable region, and the
non-decodable region against the sparsity fraction $\kappa$.
Proof: 

Combining the classical result from CS theory on the $\ell_0$-norm
decodable region for noiseless CS with Corollary 4
immediately answers the question of what the decodable
and converse regions of the $\ell_0$-norm decoder are for additive WGN
compressed sensing. As depicted in Figure 7, the $\ell_0$-norm
decoder can successfully operate as long as the sum of the
minimal "physical" sparsity originating from the WGN vector
realization (over which we have no control), $\kappa^*$, and the net
sparsity of the channel input data, $\kappa_x - \kappa^* \kappa_x$, is less than the
noiseless reconstruction threshold. Thus the operability of the
$\ell_0$-norm decoder demands

$$\kappa_x + \kappa^* - \kappa^* \kappa_x < \alpha, \qquad (29)$$

which establishes the threshold (28).

Now, based on the $\ell_0$-norm recovery, the support $\Omega_0$ of the
sparsest representation, $\mathbf{x}^*$, of the observation vector, $\mathbf{y}$, is
known. Thus, eliminating the atoms (columns) of the dictionary
corresponding to indices which are not in $\Omega_0$, the problem
becomes well-posed, and one could optimally reconstruct $\mathbf{x}$
with the Least-Squares (LS) method

$$\hat{\mathbf{x}}_{LS} = \begin{cases} \sqrt{\dfrac{m}{\mathsf{snr}}} \big( \mathcal{D}_{\Omega_0}^T \mathcal{D}_{\Omega_0} \big)^{-1} \mathcal{D}_{\Omega_0}^T \mathbf{y}, & \text{on } \Omega_0 \\ 0, & \text{elsewhere.} \end{cases} \qquad (30)$$

The MSE of this LS solution results in (27). ■

[15] D. Baron, M. B. Wakin, M. F. Duarte, S. Sarvotham, and R. G. Baraniuk,
"Distributed compressed sensing," Rice University, Tech. Rep., 2005.
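The least-squares step on a known support can be sketched as follows, assuming the channel model $\mathbf{y} = \sqrt{\mathsf{snr}/m}\,\mathcal{D}\mathbf{x} + \boldsymbol{\omega}$ of (25). For simplicity the example hands the decoder the true data support, in the manner of an oracle; the function name, sizes, SNR value and seed are hypothetical.

```python
import numpy as np

def ls_on_support(D, y, support, snr):
    """Least-squares estimate of x for y = sqrt(snr/m) * D x + noise,
    restricted to the columns listed in `support`; zero elsewhere."""
    m, n = D.shape
    A = np.sqrt(snr / m) * D[:, support]
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    x_hat = np.zeros(n)
    x_hat[support] = coeffs
    return x_hat

rng = np.random.default_rng(5)
m, n, snr = 50, 100, 100.0
D = rng.standard_normal((m, n))
support = np.sort(rng.choice(n, size=10, replace=False))
x = np.zeros(n)
x[support] = rng.standard_normal(support.size)
y = np.sqrt(snr / m) * D @ x + rng.standard_normal(m)

x_hat = ls_on_support(D, y, support, snr)
print(np.mean((x - x_hat) ** 2))
```

Once the support is fixed the system is overdetermined, so the estimate is exact without noise and its error shrinks as the SNR grows.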

V. Conclusion 

A sharp threshold for the achievability of sparse representation
of WGN is introduced via the Replica method. The marginal
distribution of such sparse representations is derived, showing
that the sparsest representation is composed of infinite-value
entries. Based on this WGN analysis, we have also established
a sharp threshold for $\ell_0$-norm decoding in noisy compressed
sensing and its corresponding MSE.

Bear in mind that for any orthonormal basis matrix $\Psi$ (e.g.,
a DFT matrix) and a Gaussian dictionary $\mathcal{D}$, the matrix $\mathcal{D}\Psi$
is also a Gaussian dictionary, thus the discussed results
apply to these cases too. Extension of this analysis to other
dictionaries is called for. Also, it may be of major interest
to search for applications, insights and consequences of the
WGN sparse representation analysis in various fields, like data
hiding, cryptography and watermarking.